xen.git
16 years agoUpdate .hgignore for tools/libxc/.zlib.deps
Keir Fraser [Fri, 21 Aug 2009 16:13:17 +0000 (17:13 +0100)]
Update .hgignore for tools/libxc/.zlib.deps

16 years agodocs/misc: Update XSM Flask documentation
Keir Fraser [Fri, 21 Aug 2009 16:12:13 +0000 (17:12 +0100)]
docs/misc: Update XSM Flask documentation

Update the XSM Flask documentation to reflect the support for
policy.24, the updated policy and policy build infrastructure, and how
to enable the optional MLS policy.

Signed-off-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: George S. Coker, II <gscoker@alpha.ncsc.mil>
16 years agopygrub: Fix elilo handling after password patch.
Keir Fraser [Fri, 21 Aug 2009 16:11:40 +0000 (17:11 +0100)]
pygrub: Fix elilo handling after password patch.

Signed-off-by: Michal Novotny <minovotn@redhat.com>
16 years agoRevert 20105:979fd420311b
Keir Fraser [Fri, 21 Aug 2009 16:00:01 +0000 (17:00 +0100)]
Revert 20105:979fd420311b

16 years agolibxc: Remove minios-specific hack for generating .zlib.deps file
Keir Fraser [Fri, 21 Aug 2009 10:10:49 +0000 (11:10 +0100)]
libxc: Remove minios-specific hack for generating .zlib.deps file

It's not needed if one relative path is replaced.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agolibxenguest: Fix libbz2/liblzma dependency computation.
Keir Fraser [Thu, 20 Aug 2009 21:26:16 +0000 (22:26 +0100)]
libxenguest: Fix libbz2/liblzma dependency computation.

 1. Create an empty dep file if neither lib is installed
 2. Forcibly disable support for libs if building minios

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agodomain builder: Implement bzip2 and LZMA loaders
Keir Fraser [Thu, 20 Aug 2009 21:12:25 +0000 (22:12 +0100)]
domain builder: Implement bzip2 and LZMA loaders

Recent upstream kernels can be compressed using either gzip,
bzip2, or LZMA.  However, the PV kernel loader in Xen currently only
understands gzip, and will fail on the other two types.  The attached
patch implements kernel decompression for gzip, bzip2, and LZMA so
that kernels compressed with any of these methods can be launched.
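
As a rough illustration of the idea (not the actual libxc loader, which is written in C), format detection can be sketched in Python by inspecting the leading magic bytes of the blob. The helper name decompress_kernel is hypothetical, and the two-byte LZMA prefix is a heuristic assumption, since legacy .lzma streams have no formal magic number.

```python
import bz2
import gzip
import lzma

# Leading bytes used to guess the compression format. The LZMA prefix is a
# heuristic assumption: legacy .lzma streams carry no formal magic number.
GZIP_MAGIC = b"\x1f\x8b"
BZIP2_MAGIC = b"BZh"
LZMA_MAGIC = b"\x5d\x00"


def decompress_kernel(blob: bytes) -> bytes:
    """Decompress a kernel blob, or return it unchanged if no known
    compression header is recognised (hypothetical helper, not Xen's API)."""
    if blob.startswith(GZIP_MAGIC):
        return gzip.decompress(blob)
    if blob.startswith(BZIP2_MAGIC):
        return bz2.decompress(blob)
    if blob.startswith(LZMA_MAGIC):
        # FORMAT_ALONE selects the legacy .lzma container.
        return lzma.decompress(blob, format=lzma.FORMAT_ALONE)
    return blob
```

A blob with none of these prefixes (for example an uncompressed ELF image) is passed through untouched, so the caller can fall back to its normal loader.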

Signed-off-by: Chris Lalancette <clalance@redhat.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agotools/flask/policy: Updates to policy and policy build infrastructure
Keir Fraser [Thu, 20 Aug 2009 20:15:24 +0000 (21:15 +0100)]
tools/flask/policy: Updates to policy and policy build infrastructure

The original xen policy infrastructure was based on an early
version of refpolicy. Because of this there was a lot of cruft that
does not apply to building a policy for xen. This patch does several
things. First, it cleans up the makefile to remove many unnecessary
build targets. Second, it fixes an issue where the policy build process
wasn't handling interface files properly. Third, it pulls in the MLS
support functions from the current refpolicy and makes use of
them. Finally, it updates the xen policy with new rules to address
changes in xen since the policy was last worked on, and provides
several new abstractions for creating domains.

Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
16 years agox86_64 hvm: Adjust COMPAT_VIRT_START for 32-bit HVM guests.
Keir Fraser [Thu, 20 Aug 2009 17:27:31 +0000 (18:27 +0100)]
x86_64 hvm: Adjust COMPAT_VIRT_START for 32-bit HVM guests.

The PV limit should not apply as there is no M2P table mapped into an
HVM guest's virtual address space.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agoxm-test: Fix testcase '11_block_attach_shared_dom0' for up-to date
Keir Fraser [Thu, 20 Aug 2009 15:19:01 +0000 (16:19 +0100)]
xm-test: Fix testcase '11_block_attach_shared_dom0' for up-to date
linux kernels

New kernels have ext2 disabled by default.  This fix uses ext3 for
testcase 11_block_attach_shared_dom0.

Signed-off-by: Andreas Florath <xen@flonatel.org>
16 years agopygrub: Add password support
Keir Fraser [Thu, 20 Aug 2009 15:17:16 +0000 (16:17 +0100)]
pygrub: Add password support

This checks for the presence of a password line in the grub.conf
of the guest image. If the line is present, both clear-text and
md5 versions of the password are supported. Editing the grub entries and
command line is disabled when a password is set in the domain's
grub.conf file but has not been entered yet. Also, a new option
to press 'p' in interactive pygrub has been added to allow entering
the grub password. It has been tested on x86_64 with PV guests and was
working fine. Also, the countdown is now stopped after a key is
pressed, i.e. when the user is probably editing the boot configuration.

Signed-off-by: Michal Novotny <minovotn@redhat.com>
16 years agox86: shadow_alloc_p2m_page() should call shadow_prealloc() before shadow_alloc()
Keir Fraser [Thu, 20 Aug 2009 15:15:52 +0000 (16:15 +0100)]
x86: shadow_alloc_p2m_page() should call shadow_prealloc() before shadow_alloc()

shadow_alloc_p2m_page() fails to call shadow_prealloc() before calling
shadow_alloc().  In certain conditions, notably when PoD is being
exercised, this may cause shadow_alloc() to fail, crashing Xen.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
16 years agox86 vmx: Update EIP when appropriate during task switch
Keir Fraser [Thu, 20 Aug 2009 12:32:31 +0000 (13:32 +0100)]
x86 vmx: Update EIP when appropriate during task switch

Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agoFix xapi xm-tests.
Keir Fraser [Thu, 20 Aug 2009 09:30:53 +0000 (10:30 +0100)]
Fix xapi xm-tests.

There were a couple of small bugs in the xapi xm-test:
o outdated XenAPI calls were removed from testcase
  (02_xapi-vbd_basic)
o minor problem with XendLocalStorageRepository
  is fixed (missed list_images() function - which
  is moved from the XenQCoWStroageRepo to the common
  base class XendStorageRepository)
  which was detected running 02_xapi-vbd_basic.
o XenAPI session handling and connecting is fixed.
o 03_xapi-network_pos was rewritten and now uses
  XenAPI.

Signed-off-by: Andreas Florath <xen@flonatel.org>
16 years agoxm-test: Add status section to xm-test/README
Keir Fraser [Thu, 20 Aug 2009 09:27:37 +0000 (10:27 +0100)]
xm-test: Add status section to xm-test/README

The report functionality is not removed because there is the hope
that somebody sets up the server-side infrastructure.

Signed-off-by: Andreas Florath <xen@flonatel.org>
16 years agox86: Remove global percpu_mm_info structure, to make dataflow through
Keir Fraser [Thu, 20 Aug 2009 09:16:58 +0000 (10:16 +0100)]
x86: Remove global percpu_mm_info structure, to make dataflow through
mm code clearer.

The FOREIGNDOM method was just confusing and pointless. The deferred
TLB flushing is of questionable value now that much automatic flushing has to be
synchronous to avoid guest SMP races.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agox86: teardown_msi_irq is not needed.
Keir Fraser [Thu, 20 Aug 2009 07:26:51 +0000 (08:26 +0100)]
x86: teardown_msi_irq is not needed.

teardown_msi_irq logic is covered in destroy_irq,
so remove it to avoid freeing msi resource twice.

Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
16 years agox86: calculate nr_irqs_gsi correctly.
Keir Fraser [Thu, 20 Aug 2009 07:26:16 +0000 (08:26 +0100)]
x86: calculate nr_irqs_gsi correctly.

This appears to be a typo introduced by Cset20076,
and it may break VT-d device assignment.

Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
16 years agoxend: Fix error caused by VT-d ACS patch.
Keir Fraser [Thu, 20 Aug 2009 07:25:41 +0000 (08:25 +0100)]
xend: Fix error caused by VT-d ACS patch.

Signed-off-by: Allen Kay <allen.m.kay@intel.com>
16 years agopygrub: Revert 19322:3118041f2259, as it breaks timeout=0 behaviour
Keir Fraser [Thu, 20 Aug 2009 07:23:33 +0000 (08:23 +0100)]
pygrub: Revert 19322:3118041f2259, as it breaks timeout=0 behaviour

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agox86: Fix arch/x86/xen.lds dependencies.
Keir Fraser [Wed, 19 Aug 2009 16:00:26 +0000 (17:00 +0100)]
x86: Fix arch/x86/xen.lds dependencies.

gcc can get the dependency target name wrong (appends .o).

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agoAMD IOMMU: support "passthrough" and "no-intremap" parameters.
Keir Fraser [Wed, 19 Aug 2009 13:23:30 +0000 (14:23 +0100)]
AMD IOMMU: support "passthrough" and "no-intremap" parameters.

Signed-off-by: Wei Wang <wei.wang2@amd.com>
16 years agoUpdate Xen Flask module to policy.24.
Keir Fraser [Wed, 19 Aug 2009 13:22:52 +0000 (14:22 +0100)]
Update Xen Flask module to policy.24.

This is a back-port of the latest SELinux code to Xen, adjusted
for Xen coding style and interfaces.  Unneeded functionality such
as most object context config data, handle_unknown, MLS field
defaulting, etc has been omitted.

Signed-off-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: George S. Coker, II <gscoker@alpha.ncsc.mil>
16 years agoxen-hvmctx: don't compile for ia64.
Keir Fraser [Wed, 19 Aug 2009 13:22:15 +0000 (14:22 +0100)]
xen-hvmctx: don't compile for ia64.

xen-hvmctx is an x86-specific tool, so it should not be compiled for ia64.

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
16 years ago[IA64] define BYTES_PER_LONG to fix compilation error.
Keir Fraser [Wed, 19 Aug 2009 13:21:56 +0000 (14:21 +0100)]
[IA64] define BYTES_PER_LONG to fix compilation error.

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
16 years agox86 hvm: Clean up vlapic/vioapic/vmsi delivery.
Keir Fraser [Wed, 19 Aug 2009 13:13:52 +0000 (14:13 +0100)]
x86 hvm: Clean up vlapic/vioapic/vmsi delivery.

In particular, avoid intermediate delivery bitmaps which restrict
number of vcpus supported.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agoxen pm trace utility cleanup
Keir Fraser [Wed, 19 Aug 2009 12:17:41 +0000 (13:17 +0100)]
xen pm trace utility cleanup

xenpm trace utility gtraceview cleanup

- add gtraceview help info on how to get raw data by xentrace
- compile trace_exit_reason in non-debug builds too. trace_exit_reason
  can be enabled/disabled by xentrace at runtime, so there is no need
  to disable it at build time.

Signed-off-by: Yu Ke <ke.yu@intel.com>
16 years agox86 hvm: Remove vendor-specific feature masking of 0x1:ECX.
Keir Fraser [Wed, 19 Aug 2009 12:16:50 +0000 (13:16 +0100)]
x86 hvm: Remove vendor-specific feature masking of 0x1:ECX.

Vendors are respecting each other's bits.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
16 years agoxend: passthrough: check if a device is behind PCIe switch that lacks ACS
Keir Fraser [Wed, 19 Aug 2009 12:12:16 +0000 (13:12 +0100)]
xend: passthrough: check if a device is behind PCIe switch that lacks ACS

Imagine a PCIe switch that doesn't support ACS (Access Control
Services) and has 2 downstream ports, A and B. According to the PCIe spec,
the switch should directly route a transaction that comes from A
and targets a device under B, bypassing the Root Complex and IOMMU
engine. This doesn't work at all in the case of an hvm guest and can
even introduce a potential security issue, so we should not allow this
kind of device assignment.

If all the intermediate PCIe switches between a device and the Root Complex
support and enable ACS, we can safely assign the device to a guest.
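
The rule in the last paragraph can be sketched as a walk up the switch hierarchy. This is only an illustration under stated assumptions: the PciePort class and is_safely_assignable function are hypothetical names, and the real xend check reads PCI capability registers rather than a prepared data structure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PciePort:
    """Hypothetical model of one PCIe switch on the path to the root."""
    name: str
    acs_supported: bool
    acs_enabled: bool
    upstream: Optional["PciePort"] = None  # None: the Root Complex is next


def is_safely_assignable(first_switch: Optional[PciePort]) -> bool:
    """True only if every intermediate switch supports and enables ACS."""
    port = first_switch
    while port is not None:
        if not (port.acs_supported and port.acs_enabled):
            return False  # peer-to-peer traffic could bypass the IOMMU here
        port = port.upstream
    return True
```

A device attached directly to the Root Complex has no intermediate switches, so the walk is empty and the device is treated as assignable.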

Cc: Allen Kay <allen.m.kay@intel.com>
Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
16 years agohotplug scripts: better same_vm checks
Keir Fraser [Wed, 19 Aug 2009 12:11:33 +0000 (13:11 +0100)]
hotplug scripts: better same_vm checks

Currently the function same_vm in block-common.sh is the one
responsible for detecting if two block devices can be used at the same
time by two VMs. This is allowed in a few specific cases: when the
two VMs are actually the same VM, and when the two VMs are the guest
and its stubdomain. We need to expand these exceptions to properly
handle save/restore issues: this patch adds to the exceptions the
case when two VMs are the same VM because of save/restore races, and
when two VMs are the guest and the stubdomain of the previous guest,
again during save/restore.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
16 years agox86: miscellaneous emulator adjustments
Keir Fraser [Wed, 19 Aug 2009 12:02:31 +0000 (13:02 +0100)]
x86: miscellaneous emulator adjustments

Defer fail_if()-s as much as possible (in favor of possibly generating
exceptions), and avoid generating exceptions when not strictly
necessary.

Avoid fail_if()-s for simple return code checks (making the code that
used them consistent with other, longer existing code).

Eliminate redundant generate_exception_if()-s checking lock_prefix
(which is already covered by the general check prior to decoding
operands).

Also fix the testing code to add PROT_EXEC for the mapping that
instructions are intended to be executed from.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
16 years agox86-64: adjust emulation of control transfers
Keir Fraser [Wed, 19 Aug 2009 12:02:04 +0000 (13:02 +0100)]
x86-64: adjust emulation of control transfers

While Intel and AMD implementations differ in various respects when
it comes to non-default operand sizes of control transfer instructions
and segment register loads (lfs, lgs, lss), it seems to make sense to
(a) match their behavior if they agree and (b) prefer the more
permissive behavior if they don't agree:

- honor operand size overrides on near branches (AMD does, Intel
  doesn't)
- honor operand size overrides on far branches (both Intel and AMD do)
- honor REX.W on far branches (Intel does, AMD doesn't except on far
  returns)
- honor REX.W on lfs, lgs, and lss (Intel does, AMD doesn't)

Also, do not permit emulation of pushing/popping segment registers
other than fs and gs as well as that of les and lds (the latter are
particularly important due to the re-use of the respective opcodes as
VEX prefixes in AVX).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
16 years agox86: extend runstate area updates
Keir Fraser [Wed, 19 Aug 2009 12:01:41 +0000 (13:01 +0100)]
x86: extend runstate area updates

In order to give guests a hint at whether their vCPU-s are currently
scheduled (so they can e.g. adapt their behavior in spin loops), update
the run state area (if registered) also when de-scheduling a vCPU.

Also fix an oversight in the compat mode implementation of
VCPUOP_register_runstate_memory_area.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
16 years agox86: Fix max_gsi calculation on systems with discontiguous GSI space.
Keir Fraser [Wed, 19 Aug 2009 11:58:15 +0000 (12:58 +0100)]
x86: Fix max_gsi calculation on systems with discontiguous GSI space.

From: Steven Smith <steven.smith@citrix.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agoxm,xend: Remove tab indents
Keir Fraser [Wed, 19 Aug 2009 11:55:15 +0000 (12:55 +0100)]
xm,xend: Remove tab indents

Signed-off-by: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
16 years agox86: Only allocate vpid for initialised vcpus.
Keir Fraser [Wed, 19 Aug 2009 11:54:43 +0000 (12:54 +0100)]
x86: Only allocate vpid for initialised vcpus.

Currently, 32 vpids are statically allocated for each
domain, which prevents supporting more
vcpus for an HVM domain. Remove the limit and
only allocate a vpid for initialized vcpus. As a
result, vpids can be non-contiguous for the vcpus of a
single domain.

Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
16 years agox86: Implement per-cpu vector for xen hypervisor
Keir Fraser [Wed, 19 Aug 2009 11:53:46 +0000 (12:53 +0100)]
x86: Implement per-cpu vector for xen hypervisor

Since the Xen and Linux code bases differ considerably, it
is very hard to port Linux's patch and apply it to Xen
directly, so this patch only adopts the core logic of Linux
and makes it work for Xen.

Key changes:
1. vector allocation algorithm
2. all IRQ chips' set_affinity logic
3. IRQ migration when cpu hot remove.
4. Break assumptions which depend on global vector policy.

Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
16 years agox86: Change Xen hypervisor's interrupt infrastructure
Keir Fraser [Wed, 19 Aug 2009 11:53:04 +0000 (12:53 +0100)]
x86: Change Xen hypervisor's interrupt infrastructure
from vector-based to IRQ-based.

In a per-cpu vector environment, the vector space becomes a
multi-dimensional resource, so a vector number is not appropriate
for indexing irq_desc, which stands for a unique interrupt source. As
Linux does, the irq number is chosen to index irq_desc. This patch
changes the vector-based interrupt infrastructure to an irq-based one.
Mostly, it follows upstream linux's changes, and some parts are
adapted for Xen.

Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
16 years agox86: Change nr_irqs to nr_irqs_gsi.
Keir Fraser [Wed, 19 Aug 2009 11:52:38 +0000 (12:52 +0100)]
x86: Change nr_irqs to nr_irqs_gsi.

Currently, nr_irqs is only used for GSI irqs, so change
the name to make its meaning more precise. This is
also the initial step to support irq allocation for
MSI interrupt sources.

Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
16 years agogdbstub: Remove noisy message on every gdbstub entry.
Keir Fraser [Sun, 16 Aug 2009 07:46:08 +0000 (08:46 +0100)]
gdbstub: Remove noisy message on every gdbstub entry.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agostubdoms: parse bridge informations
Keir Fraser [Sun, 16 Aug 2009 07:45:04 +0000 (08:45 +0100)]
stubdoms: parse bridge informations

Currently the stubdom-dm script doesn't read the bridge of a vif
from xenstore, so all the vifs assigned to the stubdom always
belong to the default bridge. This patch changes that behavior by
reading the bridge from xenstore and adding it to the stubdom config
file.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
16 years agoRevert 20066:135b350496fb
Keir Fraser [Sun, 16 Aug 2009 07:43:50 +0000 (08:43 +0100)]
Revert 20066:135b350496fb

16 years agoxen-hvmctx: a tool to print the HVM state of a running domain
Keir Fraser [Fri, 14 Aug 2009 16:26:23 +0000 (17:26 +0100)]
xen-hvmctx: a tool to print the HVM state of a running domain

Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agoxend: VBD QoS policy bits
Keir Fraser [Fri, 14 Aug 2009 16:10:11 +0000 (17:10 +0100)]
xend: VBD QoS policy bits

Add the ability to define VBD QoS policy in the xend layer.

Consider the following vbd entry:

vbd = [
   'phy:/dev/server/virtualmachine1-disk,xvda1,w,credit=5000/s@50ms',
]

This means that a VM may perform 5000 I/O operations per second, with
credit being replenished every 50 milliseconds.

The 'credit' xenstore value is used by the blkback driver to ratelimit
I/O operations for the specific device.
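
A minimal sketch of how such a credit string could be interpreted follows. The function names are hypothetical, and the even per-period split is an assumption made for illustration, not a description of blkback's actual accounting.

```python
import re
from typing import Tuple

# Matches strings such as "5000/s@50ms": a rate per second and a period in ms.
CREDIT_RE = re.compile(r"^(\d+)/s@(\d+)ms$")


def parse_credit(spec: str) -> Tuple[int, int]:
    """Split a credit spec into (operations per second, period in ms)."""
    match = CREDIT_RE.match(spec)
    if match is None:
        raise ValueError("unrecognised credit spec: %r" % spec)
    return int(match.group(1)), int(match.group(2))


def ops_per_period(rate_per_s: int, period_ms: int) -> int:
    """I/O operations granted at each replenishment (assumed even split)."""
    return rate_per_s * period_ms // 1000
```

Under that assumption, the example entry above would grant 5000 * 50 / 1000 = 250 operations every 50 milliseconds.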

Signed-off-by: William Pitcock <nenolod@dereferenced.org>
16 years agox86 mce: move mce quirks into separate files
Keir Fraser [Fri, 14 Aug 2009 16:09:39 +0000 (17:09 +0100)]
x86 mce: move mce quirks into separate files
Quirk handling is designed to easily add more quirks when needed
w/o messing around in the normal mce code.

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
16 years agoxsm/flask: Fix AVC audit message format
Keir Fraser [Fri, 14 Aug 2009 16:08:38 +0000 (17:08 +0100)]
xsm/flask: Fix AVC audit message format

Fix formatting of Flask AVC audit messages so that existing
policy tools can parse them.  After applying,
'xm dmesg | audit2allow' yields the expected result.

Signed-off-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: George S. Coker, II <gscoker@alpha.ncsc.mil>
16 years agoxsm/flask: Fix sidtab locking bug
Keir Fraser [Fri, 14 Aug 2009 16:08:12 +0000 (17:08 +0100)]
xsm/flask: Fix sidtab locking bug

We do not need to use the _irqsave/irqrestore forms of spin locking
within the sidtab in Xen's XSM Flask module, and doing so triggers a
BUG_ON() within check_lock() when we subsequently call xmalloc().
This was preventing Xen from booting with XSM/Flask enabled if built
with debug=y. It appears that this broke upon the changes to xmalloc
in changeset 18379:14a9a1629590.

Signed-off-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: George S. Coker, II <gscoker@alpha.ncsc.mil>
16 years agoAMD IOMMU: Destroy passthru guests when IO pagetable allocation fails
Keir Fraser [Fri, 14 Aug 2009 16:07:23 +0000 (17:07 +0100)]
AMD IOMMU: Destroy passthru guests when IO pagetable allocation fails

Signed-off-by: Wei Wang <wei.wang2@amd.com>
Acked-by: Wei Huang <wei.huang2@amd.com>
16 years agox86: cleanup rdmsr/wrmsr
Keir Fraser [Fri, 14 Aug 2009 11:26:35 +0000 (12:26 +0100)]
x86: cleanup rdmsr/wrmsr

Use a 64bit value instead of extracting/merging two 32bit values.

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agox86 mce: make debug messages less noisy
Keir Fraser [Fri, 14 Aug 2009 09:59:13 +0000 (10:59 +0100)]
x86 mce: make debug messages less noisy

On guest MCE read only print debug code when
a non-zero value has been read. Xen is too
noisy, otherwise.

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
16 years agoVMX: issue an NMI rather than just calling the NMI handler
Keir Fraser [Fri, 14 Aug 2009 09:58:32 +0000 (10:58 +0100)]
VMX: issue an NMI rather than just calling the NMI handler
when the VMEXIT code indicates that an NMI has been raised.
Otherwise we might hit a real NMI while in the handler.

Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
16 years agohvm: handle access to MSR_AMD64_NB_CFG
Keir Fraser [Fri, 14 Aug 2009 09:57:24 +0000 (10:57 +0100)]
hvm: handle access to MSR_AMD64_NB_CFG

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
16 years agox86: Remove EF_* duplicate defs for X86_EFLAGS_*.
Keir Fraser [Fri, 14 Aug 2009 07:36:12 +0000 (08:36 +0100)]
x86: Remove EF_* duplicate defs for X86_EFLAGS_*.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agox86: Do not clear EF.TF in crash-debug mode.
Keir Fraser [Fri, 14 Aug 2009 07:22:34 +0000 (08:22 +0100)]
x86: Do not clear EF.TF in crash-debug mode.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agogdbstub: Fix the build and make a few cleanups.
Keir Fraser [Thu, 13 Aug 2009 07:40:39 +0000 (08:40 +0100)]
gdbstub: Fix the build and make a few cleanups.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agogdbstub: Small fixes.
Keir Fraser [Wed, 12 Aug 2009 13:27:52 +0000 (14:27 +0100)]
gdbstub: Small fixes.

 * Correctly handle EFLAGS.TF in the hypervisor
 * Register value sent with 'P' command is in native byte order.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agox86 numa: fix nodes' memory parsing when SRAT table includes future-hotplug memory...
Keir Fraser [Wed, 12 Aug 2009 13:16:09 +0000 (14:16 +0100)]
x86 numa: fix nodes' memory parsing when SRAT table includes future-hotplug memory range

A node's future-hotplug memory range normally starts at a very high
address, e.g. 1TB, and is not contiguous with its existing
memory range. It should not be covered by the global variable 'nodes',
which assumes a node's memory is contiguous. Otherwise the
nodes' memory ranges can become very big and overlap, and
populate_memnodemap() fails.

We can ignore the future-hotplug memory range for now. Physical memory
hotplug support will handle it in the future.

Signed-off-by: Yang Xiaowei <xiaowei.yang@intel.com>
16 years agox86 svm: Fix PAT MSR handling when using Nested Paging.
Keir Fraser [Wed, 12 Aug 2009 13:13:54 +0000 (14:13 +0100)]
x86 svm: Fix PAT MSR handling when using Nested Paging.

Accesses to the MSR should not be intercepted.

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agox86 svm: Fix the build: vlapic_get_reg() takes two arguments.
Keir Fraser [Wed, 12 Aug 2009 13:13:00 +0000 (14:13 +0100)]
x86 svm: Fix the build: vlapic_get_reg() takes two arguments.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agotmem: one-liner correcting stat parsing ordering
Keir Fraser [Wed, 12 Aug 2009 13:06:30 +0000 (14:06 +0100)]
tmem: one-liner correcting stat parsing ordering

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
16 years agox86 svm: Fix checked builds of Windows running on AMD SVM
Keir Fraser [Wed, 12 Aug 2009 13:06:01 +0000 (14:06 +0100)]
x86 svm: Fix checked builds of Windows running on AMD SVM

Checked builds of Windows will, after every modification of the TPR,
read it back again and assert that the value read back matches with
the value written, including the priority sub-class.  Make sure that
we correctly preserve it on vmexit.

As far as I can tell from reading the documentation, the sub-class
doesn't actually do anything, so this should be pretty harmless.

Signed-off-by: Steven Smith <steven.smith@eu.citrix.com>
16 years agoxentrace: fix "%016x" format
Keir Fraser [Tue, 11 Aug 2009 06:36:26 +0000 (07:36 +0100)]
xentrace: fix "%016x" format

xentrace_format cannot handle the "0x016x" format as we expect.
It shows only %(N) in "0x016x" format, not "%(N+1)08x%(N)08x".
So I fixed tools/xentrace/formats by using the "%(N+1)08x%(N)08x" format.
I also added some TRC_PV entries.
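
To illustrate the "%(N+1)08x%(N)08x" pattern, here is a small Python sketch that prints a 64-bit value from two 32-bit trace words using mapping-style % formatting. The format_u64 helper and the choice of N = 1 are assumptions made for this example only.

```python
def format_u64(lo: int, hi: int) -> str:
    """Render a 64-bit value from its low and high 32-bit words."""
    # Keys are 1-based argument numbers, as in the "%(N)08x" notation:
    # argument 1 holds the low word, argument 2 the high word.
    args = {"1": lo, "2": hi}
    return "%(2)08x%(1)08x" % args


print(format_u64(0x89ABCDEF, 0x01234567))  # prints 0123456789abcdef
```

The high word has to come first in the format string so that the two zero-padded halves read as one 16-digit hexadecimal number.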

Signed-off-by: Akio Takebe <takebe_akio@jp.fujitsu.com>
16 years agolibxc: Include private Xen headers in stubdom libxc build
Keir Fraser [Tue, 11 Aug 2009 06:34:55 +0000 (07:34 +0100)]
libxc: Include private Xen headers in stubdom libxc build

The headers libelf.h and elfstructs.h were removed from
xen/include/public in 19011:7df072566b8c.  But this broke the stubdom
build because parts of libxc depend on them.  This patch adds
$(XEN_ROOT)/xen/include/xen to the stubdom -I path.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
16 years agoUpdate QEMU_TAG to a83d119cfcc20bc7edb427992d6e31b3e99430be
Keir Fraser [Mon, 10 Aug 2009 17:15:19 +0000 (18:15 +0100)]
Update QEMU_TAG to a83d119cfcc20bc7edb427992d6e31b3e99430be

16 years agoRevert alloc_idle_vcpu() to support multiple idle domains where max
Keir Fraser [Mon, 10 Aug 2009 12:51:28 +0000 (13:51 +0100)]
Revert alloc_idle_vcpu() to support multiple idle domains where max
vcpus is less than max pcpus (e.g., can happen on i386).

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agox86: make mce debug output more verbose
Keir Fraser [Mon, 10 Aug 2009 12:33:01 +0000 (13:33 +0100)]
x86: make mce debug output more verbose

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
16 years agox86: Remove cpumask.h inclusion from mm.h
Keir Fraser [Mon, 10 Aug 2009 12:32:02 +0000 (13:32 +0100)]
x86: Remove cpumask.h inclusion from mm.h

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agopygrub: Remove bogus log.debug line.
Keir Fraser [Mon, 10 Aug 2009 12:30:50 +0000 (13:30 +0100)]
pygrub: Remove bogus log.debug line.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agotmem: expose freeable memory
Keir Fraser [Mon, 10 Aug 2009 12:27:54 +0000 (13:27 +0100)]
tmem: expose freeable memory

Expose tmem "freeable" memory for use by management tools.

Management tools looking for a machine with available
memory often look at free_memory to determine if there
is enough physical memory to house a new or migrating
guest.  Since tmem absorbs much or all free memory,
and since "ephemeral" tmem memory can be synchronously
freed, management tools need more data -- not only how
much memory is "free" but also how much memory is
"freeable" by tmem if tmem is told (via an already
existing tmem hypercall) to relinquish freeable memory.
This patch provides that extra piece of data (in MB).
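
A management tool might combine the two figures along these lines. This is a sketch under assumptions: memory_plan is a hypothetical name, and the inputs stand for the free and freeable values (in MB) that the tool has already obtained.

```python
from typing import Tuple


def memory_plan(free_mb: int, freeable_mb: int, guest_mb: int) -> Tuple[bool, int]:
    """Return (fits, reclaim_mb) for placing a guest on this host.

    fits is True when free plus tmem-freeable memory covers the guest;
    reclaim_mb is how much tmem would have to relinquish first.
    """
    fits = free_mb + freeable_mb >= guest_mb
    reclaim_mb = max(0, guest_mb - free_mb) if fits else 0
    return fits, reclaim_mb
```

For example, a host reporting 512 MB free and 2048 MB freeable could take a 1024 MB guest once tmem gives back 512 MB.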

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
16 years agotools: Fix iptables failure test in vif-common.sh
Keir Fraser [Fri, 7 Aug 2009 16:31:27 +0000 (17:31 +0100)]
tools: Fix iptables failure test in vif-common.sh

In changeset 19540 a bug was introduced in the fib_iptable function in
vif-common.sh that incorrectly checks the exit status of iptables --
it always believes iptables has failed even when it hasn't.

The attached patch fixes that.  It's also bug 1490.

Signed-off-by: John Haxby <john.haxby@oracle.com>
16 years agox86: replace PAT initialisation magic value with a #define
Keir Fraser [Fri, 7 Aug 2009 16:30:33 +0000 (17:30 +0100)]
x86: replace PAT initialisation magic value with a #define

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
16 years agox86: Increase default max CPUs to 64.
Keir Fraser [Fri, 7 Aug 2009 16:29:50 +0000 (17:29 +0100)]
x86: Increase default max CPUs to 64.

Also remove compile-time limit of 32 for i386. It is no longer
required, since a cpumask was moved out of struct page_info.

Signed-off-by: Wei Gang <gang.wei@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agox86 p2m: use common p2m ops in common p2m code path
Keir Fraser [Fri, 7 Aug 2009 16:23:11 +0000 (17:23 +0100)]
x86 p2m: use common p2m ops in common p2m code path

We recently found an assertion failure when EPT mode is
enabled on a 32PAE host with debug=y. This patch fixes
that by using the common p2m ops in the
common p2m code path p2m_remove_page rather than calling
p2m_gfn_to_mfn(), which is for shadow mode only.

Signed-off-by: Xin, Xiaohui <xiaohui.xin@intel.com>
16 years agoxend: Rename device backend value when xm save/migrate
Keir Fraser [Fri, 7 Aug 2009 16:22:04 +0000 (17:22 +0100)]
xend: Rename device backend value when xm save/migrate

Xend often fails to restore/migrate
a PV domain whose device backends are partly in a driver domain.

Because a checkpoint of the PV domain stores the device backend value as
a domain id, you can restore/migrate the PV domain only when the driver
domain has the same id as the device backend value in the checkpoint.

This patch fixes it by renaming the device backend value in a
checkpoint from domain id to domain name when xm save/migrate is run.

This patch doesn't rename device backend value if the value is 0,
which is Domain-0, so the checkpoint format is compatible if you use
only Domain-0 as device backend.
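
The renaming rule can be sketched as follows. The function name and the id-to-name mapping are hypothetical and stand in for whatever lookup xend actually performs.

```python
from typing import Dict


def backend_for_checkpoint(backend_domid: int, names: Dict[int, str]) -> str:
    """Value to store for a device backend when saving a checkpoint."""
    if backend_domid == 0:
        # Domain-0 keeps its numeric id so old checkpoints stay compatible.
        return "0"
    # Driver domains are recorded by name, since their id may differ on restore.
    return names[backend_domid]
```

On restore, a name can be resolved back to whatever id the driver domain has on the target host, which an id stored at save time cannot guarantee.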

Signed-off-by: Rikiya Ayukawa <ayukawa.rikiya@jp.fujitsu.com>
16 years agox86_emulate: Fixes for 'mov rm16,sreg'
Keir Fraser [Fri, 7 Aug 2009 09:53:22 +0000 (10:53 +0100)]
x86_emulate: Fixes for 'mov rm16,sreg'

1. Memory reads should be 16 bits only
2. Attempt to load %cs should result in #UD

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agox86_emulate: protmode_load_seg() cannot load system segments in long mode.
Keir Fraser [Fri, 7 Aug 2009 08:54:43 +0000 (09:54 +0100)]
x86_emulate: protmode_load_seg() cannot load system segments in long mode.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agohvmloader: Regression tests need 16MB to run. Check for this.
Keir Fraser [Thu, 6 Aug 2009 10:14:48 +0000 (11:14 +0100)]
hvmloader: Regression tests need 16MB to run. Check for this.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agoept: code clean up and formatting.
Keir Fraser [Thu, 6 Aug 2009 09:02:20 +0000 (10:02 +0100)]
ept: code clean up and formatting.

Fix alignment and comments and add and remove spaces and lines where
appropriate.

Signed-off-by: Patrick Colp <Patrick.Colp@citrix.com>
16 years agox86_emulate: Remove cmpxchg retry loop from protmode_load_seg().
Keir Fraser [Thu, 6 Aug 2009 08:54:22 +0000 (09:54 +0100)]
x86_emulate: Remove cmpxchg retry loop from protmode_load_seg().

It is safer to retry in a loop via the caller.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agotmem: Remove bogus variable decl, fixing build.
Keir Fraser [Thu, 6 Aug 2009 08:53:37 +0000 (09:53 +0100)]
tmem: Remove bogus variable decl, fixing build.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agotmem: save/restore/migrate/livemigrate and shared pool authentication
Keir Fraser [Thu, 6 Aug 2009 08:19:55 +0000 (09:19 +0100)]
tmem: save/restore/migrate/livemigrate and shared pool authentication

Attached patch implements save/restore/migration/livemigration
for transcendent memory ("tmem").  Without this patch, domains
using tmem may in some cases lose data when doing save/restore
or migrate/livemigrate.  Also included in this patch is
support for a new (privileged) hypercall for authorizing
domains to share pools; this provides the foundation to
accommodate upstream linux requests for security for shared
pools.

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
16 years agoept mtrr: replace unsigned long with mfn_t for mfns.
Keir Fraser [Thu, 6 Aug 2009 08:15:42 +0000 (09:15 +0100)]
ept mtrr: replace unsigned long with mfn_t for mfns.

Signed-off-by: Patrick Colp <Patrick.Colp@citrix.com>
16 years agoept p2m: replace unsigned long with mfn_t for mfns.
Keir Fraser [Thu, 6 Aug 2009 08:15:24 +0000 (09:15 +0100)]
ept p2m: replace unsigned long with mfn_t for mfns.

Signed-off-by: Patrick Colp <Patrick.Colp@citrix.com>
16 years agoept p2m: set rwx flags to 0 for invalid and mmio_dm types.
Keir Fraser [Thu, 6 Aug 2009 08:14:52 +0000 (09:14 +0100)]
ept p2m: set rwx flags to 0 for invalid and mmio_dm types.

Read/write/execute flags are set to 1 before calling the type_to_flags
function which sets them to their appropriate values depending on the
p2m type. However, the invalid, mmio_dm, and default/unknown cases in
type_to_flags just fall through, unsafely leaving full access to
these pages.
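
As an illustration only, the deny-by-default behaviour described above can
be modelled in a few lines of Python. This is a sketch, not the hypervisor
code (which is C): the EptEntry class, the string type names, and the exact
flag values chosen for the RAM and direct-MMIO cases are assumptions made
for the example. Only the handling of invalid, mmio_dm, and unknown types
is taken from the description.

```python
from dataclasses import dataclass


@dataclass
class EptEntry:
    """Hypothetical stand-in for an EPT entry's permission bits."""
    r: int = 0
    w: int = 0
    x: int = 0


def type_to_flags(entry, p2m_type):
    """Set r/w/x from the p2m type; anything unrecognised gets no access."""
    if p2m_type in ("ram_rw", "mmio_direct"):
        # Assumed: fully accessible types.
        entry.r, entry.w, entry.x = 1, 1, 1
    elif p2m_type in ("ram_ro", "ram_logdirty"):
        # Assumed: write-protected types.
        entry.r, entry.w, entry.x = 1, 0, 1
    else:
        # invalid, mmio_dm, and unknown types: clear every flag rather
        # than falling through with whatever the caller had set.
        entry.r, entry.w, entry.x = 0, 0, 0
    return entry


# The caller pre-sets all flags to 1, as the description says.
entry = type_to_flags(EptEntry(1, 1, 1), "mmio_dm")
```

Clearing the flags in the fallback branch means a type added later without
an explicit case ends up with no access instead of full access.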

Signed-off-by: Patrick Colp <Patrick.Colp@citrix.com>
16 years agoRevert 20006:edf21ab7d7a4 and 20023:2b28320c6f8c.
Keir Fraser [Wed, 5 Aug 2009 13:56:29 +0000 (14:56 +0100)]
Revert 20006:edf21ab7d7a4 and 20023:2b28320c6f8c.

16 years agoRevert to pulling QEMU GIT repo via HTTP.
Keir Fraser [Wed, 5 Aug 2009 13:39:46 +0000 (14:39 +0100)]
Revert to pulling QEMU GIT repo via HTTP.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agoxend: Remove _setSchedParams
Keir Fraser [Wed, 5 Aug 2009 13:03:38 +0000 (14:03 +0100)]
xend: Remove _setSchedParams

Currently, xc.sched_credit_domain_set is called twice when domains
are created.

start@XendDomainInfo
  _constructDomain
    xc.sched_credit_domain_set  --- 1st
  _initDomain
    _setSchedParams
      domain_sched_credit_set
        xc.sched_credit_domain_set  --- 2nd

resume@XendDomainInfo
  _constructDomain
    xc.sched_credit_domain_set  --- 1st
  _setSchedParams
    domain_sched_credit_set
      xc.sched_credit_domain_set  --- 2nd

This patch removes the _setSchedParams method added by changeset 19955,
because xc.sched_credit_domain_set was added to the _constructDomain
method by changeset 20006.

Signed-off-by: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
16 years agox86 vmx: Accelerate VLAPIC EOI writes
Keir Fraser [Wed, 5 Aug 2009 13:02:46 +0000 (14:02 +0100)]
x86 vmx: Accelerate VLAPIC EOI writes

Our testing indicates that most apic accesses are eoi writes. This
patch accelerates guest EOI emulation by utilizing HW VM Exit
information.

Without this patch, xentrace shows that the average tsc cost of an
apic access is ~7.8k in our case; with it, the cost drops to ~3k. We
also save 3% cpu in our case.

From: Yang Zhang <yang.zhang@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agox86: CPU synchronization while doing MTRR register update
Keir Fraser [Wed, 5 Aug 2009 12:50:36 +0000 (13:50 +0100)]
x86: CPU synchronization while doing MTRR register update

The current Xen code does not synchronize all the cpus while
initializing MTRR registers when a cpu comes up.

As per IA32 SDM vol 3: Section: 10.11.8 MTRR Considerations in MP
Systems, all the processors should be synchronized while updating
MTRRs.

Processors starting with Westmere cache VMCS data for better VMX
performance. These processors also have Hyper-threading support. With
hyper-threading, when one thread's cache is disabled, the cache is
also disabled for the sibling threads. The MTRR register updating
procedure involves disabling the cache. So if cpus are not
synchronized, updating MTRR registers on one thread makes the VMCS
data of sibling threads inaccessible, which causes system failure.

With this patch while updating the MTRR registers, all the cpus are
synchronized as per the IA32 SDM. Also at the boot time and resume
time when multiple cpus are brought up, an optimization is added to
delay the MTRR initialization until all the cpus are up, to avoid
synchronizing the cpus multiple times.

Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>
Signed-off-by: Suresh B Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Asit K Mallick <asit.k.mallick@intel.com>
16 years agox86: Enable GNTTABOP_copy hypercall for HVMs
Keir Fraser [Wed, 5 Aug 2009 12:49:35 +0000 (13:49 +0100)]
x86: Enable GNTTABOP_copy hypercall for HVMs

This requires plumbing 32-bit compat guests through the compat version
of the grant-table hypercall.

Signed-off-by: Jayaraman, Bhaskar <Bhaskar.Jayaraman@lsi.com>
16 years agoxm-test restore: use ext3 (instead of ext2) and xvda (instead of hda)
Keir Fraser [Wed, 5 Aug 2009 12:40:21 +0000 (13:40 +0100)]
xm-test restore: use ext3 (instead of ext2) and xvda (instead of hda)

This patch fixes the xm-test restore 04 testcase:
o uses ext3 instead of ext2 - which is not supported by the standard
kernel config
o uses xvdX instead of hdX for disks

Signed-off-by: Andreas Florath <xen@flonatel.org>
16 years agoxm-test: Disable DEBUG_STACK_USAGE which breaks test cases
Keir Fraser [Wed, 5 Aug 2009 12:39:37 +0000 (13:39 +0100)]
xm-test: Disable DEBUG_STACK_USAGE which breaks test cases

The unnecessary 'used greatest stack depth' messages on the console
break xm-test cases at random.  Typically a testcase reads input from
the console and parses it.  When DEBUG_STACK_USAGE is enabled, these
stack usage messages are printed at random times; the test case reads
such a message, cannot handle it, and fails.

Signed-off-by: Andreas Florath <xen@flonatel.org>
16 years agoxm-test: fix network13 test (protocol and extensions)
Keir Fraser [Wed, 5 Aug 2009 12:38:38 +0000 (13:38 +0100)]
xm-test: fix network13 test (protocol and extensions)

The attached patch changes the protocol used from udp (nobody was
listening...) to icmp echo, and extends the test so that the dom0 and
the other guest ips are also pinged.
Because of the many different scenarios (three nested loops) over
packet sizes, two guests and different ip addresses, one run of this
test case now takes about 4.5 minutes.

Signed-off-by: Andreas Florath <xen@flonatel.org>
16 years agoxm-test: Adapt memory setting to up-to-date kernel memory consumption
Keir Fraser [Wed, 5 Aug 2009 12:37:26 +0000 (13:37 +0100)]
xm-test: Adapt memory setting to up-to-date kernel memory consumption

The attached patch fixes xm-test memset 04 so that it can be used with
up-to-date kernels.  The old version sets the memory to 15MByte, which
is too low for modern kernels: the oom-killer in this case kills the
login shell of the test-case and init.  Increased the size to 18M,
which gives the userspace about 2.5 MByte memory.

Signed-off-by: Andreas Florath <xen@flonatel.org>
16 years agoxm-test: 10_block_attach_detach_multiple_devices fixed
Keir Fraser [Wed, 5 Aug 2009 12:36:24 +0000 (13:36 +0100)]
xm-test: 10_block_attach_detach_multiple_devices fixed

This patch fixes and (re-)enables test 10 of the block-create suite.
The test attaches and detaches devices to / from a domU at random and
checks that everything is ok.

Signed-off-by: Andreas Florath <xen@flonatel.org>
16 years agoxm-test block-create: use ext3 as filesystem
Keir Fraser [Wed, 5 Aug 2009 11:06:24 +0000 (12:06 +0100)]
xm-test block-create: use ext3 as filesystem

The current implementation uses ext2 for tests.  The tests currently
fail, because the current kernel does not support ext2 by default.
This patch creates an ext3 filesystem for the tests.

Signed-off-by: Andreas Florath <xen@flonatel.org>
16 years agoxend: fix memory leak resulting in long garbage collector runs
Keir Fraser [Wed, 5 Aug 2009 11:04:39 +0000 (12:04 +0100)]
xend: fix memory leak resulting in long garbage collector runs

In the methods xen.xend.XendStateStore.XendStateStore.load_state and
xen.xend.XendStateStore.XendStateStore.save_state, the minidom objects
used to load/save the current state of a device type can't be freed
by the python garbage collector after all references to the top node
are cleared, because of cyclic references between the DOM nodes. So
memory usage of xend increases after calling these methods.  To solve
this problem, the unlink() method must be called for a minidom object
before the last reference to the top node is cleared (see python
docs). This breaks the cyclic references, so the garbage collector can
free these objects.
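
A minimal sketch of the pattern described above, using only the standard
xml.dom.minidom module. The function name, the XML layout, and the
attribute names are hypothetical and chosen for the example; they are not
taken from XendStateStore. The point shown is that the values are copied
out of the DOM and unlink() is then called in a finally block, so the
cyclic parent/child references are broken even if parsing the content
raises.

```python
from xml.dom import minidom


def load_state(xml_text):
    """Parse a state document, copy out the values, then release the DOM."""
    dom = minidom.parseString(xml_text)
    try:
        root = dom.documentElement
        # Copy plain strings out of the tree so nothing returned to the
        # caller keeps a reference to a DOM node.
        return {
            node.getAttribute("name"): node.getAttribute("value")
            for node in root.getElementsByTagName("item")
        }
    finally:
        # Break the cyclic references between nodes so the whole tree
        # can be reclaimed promptly.
        dom.unlink()


state = load_state(
    '<state><item name="a" value="1"/><item name="b" value="2"/></state>'
)
```

Returning plain strings rather than nodes matters here: after unlink() the
nodes are no longer usable, so any data needed later has to be extracted
first.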

Signed-off-by: juergen.gross@ts.fujitsu.com
16 years agoxend: pass-through: Extend multi-function mapping
Keir Fraser [Wed, 5 Aug 2009 11:03:53 +0000 (12:03 +0100)]
xend: pass-through: Extend multi-function mapping

This extends the mapping between physical and virtual PCI functions
for multi-function pass-through in two ways. If neither of these
rules applies, the existing identity-mapping of physical to virtual
functions is used.

1) If physical function zero is not present in a multi-function
   pass-through device then the numerically lowest physical function
   whose virtual function hasn't explicitly been set will be mapped
   to virtual function 0.

   This is to satisfy the requirement that a (virtual) device
   must always have function 0 present.

2) The virtual function to be used for a physical function may
   be explicitly set.

   e.g. 00:1d.2=0,1=1,0=2@7 will result in the following
   mapping:

        physical | virtual
        ---------+--------
        00:1d.2  | 00:07.0
        00:1d.1  | 00:07.1
        00:1d.0  | 00:07.2

   Ranges may also be used with explicit assignment.
   The following would result in the same mapping as above:

       00:1d.2=0-0=2@7

Please be aware that these extensions very likely make it possible
to create mappings that do not work. If in doubt, please use
identity-mapping.
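
The two rules and the examples above can be sketched as a small parser.
This is an illustration written from the description only, not the xend
implementation: the function names are hypothetical, the reading of a
range such as 2=0-0=2 (physical functions counting from 2 to 0 while
virtual functions count from 0 to 2) is an assumption based on the
example, and no validation of conflicting assignments is attempted.

```python
def parse_item(item):
    """Parse 'p' or 'p=v' into (physical, virtual or None)."""
    phys, sep, virt = item.partition("=")
    return int(phys), (int(virt) if sep else None)


def span(a, b):
    """Inclusive integer range from a to b, counting up or down."""
    step = 1 if b >= a else -1
    return range(a, b + step, step)


def expand(item):
    """Expand one comma-separated item into (physical, virtual) pairs."""
    if "-" not in item:
        return [parse_item(item)]
    first, last = item.split("-")
    (p1, v1), (p2, v2) = parse_item(first), parse_item(last)
    phys = list(span(p1, p2))
    if v1 is None and v2 is None:
        return [(p, None) for p in phys]
    if v1 is None or v2 is None:
        raise ValueError("range must set both virtual functions or neither")
    virt = list(span(v1, v2))
    if len(virt) != len(phys):
        raise ValueError("physical and virtual ranges differ in length")
    return list(zip(phys, virt))


def parse_mapping(spec):
    """Return (device, virtual slot or None, {physical: virtual})."""
    device, _, rest = spec.rpartition(".")
    funcs, _, vslot = rest.partition("@")
    mapping = {}
    for item in funcs.split(","):
        mapping.update(expand(item))
    unset = sorted(p for p, v in mapping.items() if v is None)
    # Rule 1: without physical function 0, the lowest function that has
    # no explicit assignment takes virtual function 0.
    if 0 not in mapping and 0 not in mapping.values() and unset:
        mapping[unset.pop(0)] = 0
    # Everything still unset falls back to identity-mapping.
    for p in unset:
        mapping[p] = p
    return device, (int(vslot, 16) if vslot else None), mapping


explicit = parse_mapping("00:1d.2=0,1=1,0=2@7")
ranged = parse_mapping("00:1d.2=0-0=2@7")
```

Under these assumptions both specifications yield the mapping shown in
the table: physical functions 2, 1, 0 map to virtual functions 0, 1, 2
in virtual slot 7.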

Cc: Dexuan Cui <dexuan.cui@intel.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
16 years agoxend: passthrough: add checking when a device is hotplugged into pv guest.
Keir Fraser [Wed, 5 Aug 2009 11:03:08 +0000 (12:03 +0100)]
xend: passthrough: add checking when a device is hotplugged into pv guest.

When we 'xm pci-attach' a device into a pv guest, we also need to
check whether the device is owned by pciback or pci-stub, whether the
device has already been assigned, etc.

Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
16 years agoAdd a single trigger for all diagnostic keyhandlers
Keir Fraser [Sun, 2 Aug 2009 12:43:15 +0000 (13:43 +0100)]
Add a single trigger for all diagnostic keyhandlers

Add a new keyhandler that triggers all the side-effect-free
keyhandlers.  This lets automated tests (and users) log the full set
of keyhandlers without having to be aware of which ones might reboot
the host.

Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>